fix: validate tensor dims_count against the shared-memory region #438
Open
LinZiyuu wants to merge 1 commit into
PbTensor::LoadFromSharedMemory() reads `dims_count` from shared memory and uses it to compute `name_offset` and to construct the dims vector (`std::vector<int64_t>(dims_ptr, dims_ptr + dims_count)`) without checking it against the region.

A corrupted `dims_count` (e.g. written by a model into the backend shm) makes the parent process perform a large out-of-bounds read, crashing the server; the `sizeof(int64_t) * dims_count` product can also overflow and yield a small, controlled `name_offset`.

Validate `dims_count` so that the dims array stays within the shared-memory region before it is used, throwing PythonBackendException otherwise. The check uses division to avoid overflowing the product, and mirrors the MemoryShm::byte_size boundary check. Valid tensors are unaffected.

Signed-off-by: LinZiyuu <linziyu0205@163.com>
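As an illustration of the overflow mentioned in the commit message, the following is a minimal sketch rather than the backend's code. `UncheckedDimsBytes` and `kCorruptedCount` are hypothetical names, and the sketch assumes a 64-bit unsigned count. It shows how a very large count wraps the product to a small value:

```cpp
#include <cstdint>

// Hypothetical stand-in for the unchecked size computation; not the backend's code.
// Unsigned 64-bit multiplication wraps modulo 2^64 instead of failing.
constexpr std::uint64_t UncheckedDimsBytes(std::uint64_t dims_count)
{
  return sizeof(std::int64_t) * dims_count;
}

// 8 * (2^61 + 1) = 2^64 + 8, which wraps to 8: a huge count yields a tiny byte size.
constexpr std::uint64_t kCorruptedCount = (std::uint64_t{1} << 61) + 1;
static_assert(UncheckedDimsBytes(kCorruptedCount) == 8, "product wraps to 8");
```

Because the wrapped product looks like a plausible size, any check performed on the product itself cannot be trusted, which is why the count has to be validated before the multiplication.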
This validates the tensor `dims_count` read from shared memory before it is used, extending the shared-memory boundary validation added in #405 / #406.

Previously, `PbTensor::LoadFromSharedMemory()` took `dims_count` from the shared-memory region and used it to compute `name_offset` and to build the dims vector (`std::vector<int64_t>(dims_ptr, dims_ptr + dims_count)`) with no bounds check. A corrupted `dims_count` makes the parent process perform a large out-of-bounds read and crash the server, and the `sizeof(int64_t) * dims_count` product can overflow into a small, controlled `name_offset`. The #406 fix bounded `MemoryShm::byte_size`, but not this dimension count.

This change validates `dims_count` so the dims array stays within the region before it is used, throwing `PythonBackendException` otherwise. The check uses division to avoid overflowing the product, and mirrors the `MemoryShm::byte_size` boundary check. Valid tensors are unaffected.

Reproduced on `nvcr.io/nvidia/tritonserver:26.04-py3` (CPU): a model that overwrites a live output tensor's `dims_count` in the backend shared memory crashes the whole server (`Exited (139)`, SIGSEGV) within seconds, via `python_be.cc` → `InferResponse::LoadFromSharedMemory` → `PbTensor::LoadFromSharedMemory`. With this change the corrupted tensor is rejected with an error instead of faulting.

The sibling unbounded values read from shared memory have the same pattern and are worth a follow-up: `InferResponse::outputs_size`, `InferRequest::requested_output_count` / `input_count`, `PbMap::length`, `MessageQueue::Pop`'s tail, and the object handles passed to `SharedMemoryManager::Load<T>`.
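For reviewers, the division-based check described above can be sketched roughly as follows. This is not the code in the diff: `DimsFitInRegion`, `ValidateDimsCount`, `dims_offset`, and `region_size` are hypothetical names, the sketch assumes the region size and the offset of the dims array are both known in bytes, and `std::out_of_range` stands in for `PythonBackendException`.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical helper, not the code in this PR. `dims_offset` is where the dims
// array starts inside the region; `region_size` is the region's size in bytes.
constexpr bool DimsFitInRegion(
    std::uint64_t dims_count, std::uint64_t dims_offset,
    std::uint64_t region_size)
{
  // The offset is checked first so the subtraction cannot underflow.
  // Dividing the remaining bytes avoids computing sizeof(int64_t) * dims_count,
  // which could wrap for a corrupted count.
  return dims_offset <= region_size &&
         dims_count <= (region_size - dims_offset) / sizeof(std::int64_t);
}

// Rejects a count that does not fit. std::out_of_range is a stand-in here;
// the actual change is described as throwing PythonBackendException.
inline void ValidateDimsCount(
    std::uint64_t dims_count, std::uint64_t dims_offset,
    std::uint64_t region_size)
{
  if (!DimsFitInRegion(dims_count, dims_offset, region_size)) {
    throw std::out_of_range("dims_count exceeds the shared-memory region");
  }
}
```

With a 48-byte region and a dims array starting at byte 16, 32 bytes remain, so a count of 4 passes while a count of 5, or a wrapped-looking count such as 2^61 + 1, is rejected. The same shape of check would presumably apply to the sibling counts listed for follow-up, with the element size adjusted.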